Thermodynamics' first law: what information theory tells us 
Thermodynamics, and in particular its first law, is of fundamental importance to science and
therefore of general interest to all physicists. The first law, although undoubtedly true and
universally accepted because of its many verified consequences, rests on a rather weak
experimental foundation, since its path-independence aspect has never been directly verified.
Its theoretical foundation is also somewhat weak, because the so-called adiabatic theorem
(AT) must be invoked to prove it from first principles. We provide here a more direct and
convincing theoretical demonstration that dispenses with the AT and with some other commonly
employed axioms.
PACS: 05.30.-d, 05.30.Jp 



INTRODUCTION 

Jaynes' pioneering 1957 papers in The Physical Review
constitute a new theoretical foundation for the development
of statistical mechanics, providing us with a solid
alternative on the basis of information theory (IT).
One of the main ingredients in Jaynes' treatment was
an intensive use of the principle of parsimony (PP) [3].
Our purpose here is to present an original derivation of
thermodynamics' first law from an IT viewpoint and, at
the same time, to provide a pedagogical example illustrating
the principle of parsimony. The example explicitly invokes
Jaynes' discovery (of more than forty years ago) that no
reference to equilibrium needs to be made in order to deal
completely and successfully with all of thermodynamics.
To some physicists, even theoreticians, the idea sounds
revolutionary even today!

The PP, or Ockham's razor, is a basic methodological
principle that governs scientific endeavor. It dictates
simplicity in theory construction, as for instance in the
number of axioms, or of parameters, involved in a
theoretical construct.

As stated above, we will here apply this principle with
regard to the usual IT-treatment of the first law of
thermodynamics [3]. Appeal to Ockham's razor will yield
a simpler derivation than the usual textbook one.



JAYNES' APPROACH AND OCKHAM'S RAZOR 

The orthodox formulation of statistical mechanics is due
to Gibbs, working on a classical mechanics substratum.
It is based upon the following set of axioms:

1. Ensemble postulate: the system at equilibrium can 
be represented by an appropriately designed ensem- 
ble. 

2. Equal a priori probabilities for cells in phase space. 



3. The phase space probability distribution depends 
only on the system's Hamiltonian. 

4. This dependence is of exponential form. 

Jaynes reformulated statistical mechanics in 1957 [1] by
recourse to information theory concepts, with quantum
mechanics now providing the background. Instead of
a distribution function in phase space we now use a density
operator $\rho$ to describe our system. $\rho$ is obtained via
the so-called MaxEnt principle, namely, the constrained
(Lagrange) maximization of Shannon's logarithmic
information measure $S$, regarded as a measure of ignorance,
with

$$S = -k_B\, \mathrm{Tr}[\rho \ln \rho], \tag{1}$$



where $k_B$ is the Boltzmann constant, to be set equal to
unity from now on. Jaynes' basic axiom or postulate
reads: the density operator that describes our system is
that provided by the MaxEnt principle. One might argue
that this postulate explicitly assumes the logarithmic form
of Shannon's entropy, but such a statement can be refuted
by pointing out that other forms have been used in the
literature to this effect with great success. A second
axiom that one needs, however, is that $\rho$ depends
explicitly only on expectation values and implicitly on some
Hamiltonian.

Most interestingly, no reference to either i) equal a priori
probabilities or ii) equilibrium needs to be made. On IT
grounds, equilibrium refers to the state of knowledge of
the observer; it is not an intrinsic property of the system.
Information theory is essentially concerned with
epistemology: equilibrium means that one's knowledge
is restricted to constants of the motion, so that one can
forget about dynamics. The equilibrium notion plays
no part whatsoever in our considerations.
On Ockham's razor grounds, one might argue that
Jaynes' number of postulates is smaller than Gibbs'. In
particular, since no mention of "equilibrium" is made,
the associated theory has, at least potentially, a wider
outreach than that of Gibbs. It is perhaps necessary
to point out here that an entirely different
information theory approach to non-equilibrium
thermodynamics, based upon Fisher's measure (a kind
of "Fisher-MaxEnt"), has recently been advanced that
exhibits definite advantages over both Gibbs' and (the
original) Jaynes' treatments.
Returning to Jaynes' approach, assume that we deal
with a system with Hamiltonian $H$. The system
is characterized by the set of operators $\{O_i\}$
($i = 1, \ldots, M$, with $O_1 = H$) in the sense that one is
supposed to know the expectation values of these operators.
In other words,

$$\langle O_i \rangle = \mathrm{Tr}[\rho\, O_i] \qquad (i = 1, \ldots, M) \tag{2}$$



constitutes our a priori information concerning the system.
We wish to find the appropriate, most unbiased $\rho$
that reproduces this amount of information and otherwise
maximizes our ignorance: the truth, all the truth,
nothing but the truth [3]. Using any other $\rho$ is tantamount
to inventing information that we actually do not
possess. Extremizing then (1) subject to the constraints
(2) leads to a density operator of the form [3]

$$\rho = Z^{-1} \exp\Big(-\sum_{i=1}^{M} \lambda_i O_i\Big), \tag{3}$$



where $\{\lambda_i\}$ is a set of Lagrange multipliers ($\lambda_1 = \beta$ is
the inverse temperature) that arise during the Lagrange
process. The $\{\lambda_i\}$ are associated with the above
expectation values that represent our foreknowledge [3]. The
normalization factor in Eq. (3),

$$Z = \mathrm{Tr}\Big[\exp\Big(-\sum_{i=1}^{M} \lambda_i O_i\Big)\Big], \tag{4}$$



is the partition function. The formalism allows for
arbitrary variations in the expectation values to be carried
out. Let us insist: thermodynamics was derived
from the IT formalism more than 40 years ago. If one
relies on IT, it is clear that, epistemologically, no
additional thermodynamic notions are to be presupposed in
advance. Notice that predictions derived from the
IT-formalism amply exceed the scope of themes that
conventional thermodynamics is able to deal with.

In order to obtain the first law, the usual textbook
approach analyzes the variation of the internal energy
$U$, which is regarded as a functional of both i) the density
operator and ii) the Hamiltonian.
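As an illustration of Eqs. (3) and (4), the following minimal Python sketch builds the MaxEnt state in the special case, assumed here purely for simplicity, in which all the operators commute and are diagonal in a common basis, so that the density operator reduces to a probability vector. The three-level spectrum, the second observable, the multiplier values, and all function names are hypothetical choices made for illustration only.

```python
import math


def maxent_distribution(lambdas, observables):
    """Diagonal MaxEnt state: p_n = exp(-sum_i lambda_i * O_i[n]) / Z.

    Assumption: every observable is given by its eigenvalues in one
    common eigenbasis (i.e. all the O_i commute).
    """
    n_states = len(observables[0])
    weights = [
        math.exp(-sum(lam * obs[n] for lam, obs in zip(lambdas, observables)))
        for n in range(n_states)
    ]
    z = sum(weights)  # partition function, Eq. (4)
    return [w / z for w in weights], z


def expectation(probs, obs):
    """Expectation value <O> = sum_n p_n O[n], the diagonal form of Eq. (2)."""
    return sum(p * o for p, o in zip(probs, obs))


def entropy(probs):
    """Shannon entropy S = -sum_n p_n ln p_n, with k_B = 1 as in Eq. (1)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical three-level system (illustrative numbers only).
energies = [0.0, 1.0, 2.0]    # eigenvalues of O_1 = H
second_obs = [1.0, 0.5, 2.0]  # eigenvalues of a second observable O_2
lambdas = [1.5, 0.3]          # lambda_1 = beta, lambda_2

probs, z = maxent_distribution(lambdas, [energies, second_obs])
means = [expectation(probs, obs) for obs in (energies, second_obs)]
S = entropy(probs)

# Substituting Eq. (3) into Eq. (1) gives S = ln Z + sum_i lambda_i <O_i>.
S_from_identity = math.log(z) + sum(l * m for l, m in zip(lambdas, means))
print(S, S_from_identity)
```

The two printed numbers should agree to rounding error, since the second is just Eq. (1) evaluated with the explicit form (3) of the state.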



OUR PRESENT GOAL 

We will show in this work that it is possible to perform
a different treatment that considers the internal energy
as a functional of solely the density operator. According
to Ockham's razor, this way of handling the first law is
to be preferred to the traditional one, since in describing
the thermodynamic work $W$ one can now dispense with
the following theoretical assumptions (or needs) of the
traditional approach:



Assumptions that will no longer be needed

• reference to "equilibrium",

• explicit dependence of the Hamiltonian on some
"external" parameters $\chi$,

• recourse to the adiabatic theorem (AT).

We can summarize the AT's contents as follows: let
us regard the Hamiltonian as depending on a parameter
$\chi$ that evolves in time from an initial value $\chi_1$ to a final
value $\chi_2$ during a time interval $\tau$, in the fashion

$$\chi(t) = \chi_1 + \frac{t}{\tau}\,(\chi_2 - \chi_1); \qquad (\chi(0) = \chi_1,\ \chi(\tau) = \chi_2). \tag{5}$$

In the limit of an exceedingly slow, physically unrealizable
$\chi$-change (i.e., for $\tau \to \infty$), the time evolution that
$H(\chi(t))$ generates during the temporal interval $[0, \tau]$ is
such that, if the system is represented at $t = 0$ by an
eigenstate of $H(\chi_1)$, it will be found in an eigenstate of
$H(\chi_2)$ at $t = \tau$. Indeed, at any time $t$ in $[0, \tau]$ it will be
encountered in an eigenstate of $H(\chi(t))$.
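The content of the AT can be illustrated numerically. The sketch below is not part of the derivation: it assumes a hypothetical two-level Hamiltonian $H(\chi) = \chi\,\sigma_z + \Delta\,\sigma_x$ (with $\hbar = 1$), sweeps $\chi$ linearly as in Eq. (5), and propagates the initial ground state with a piecewise-constant approximation of $H$ over many small time steps. The model, the parameter values, and the function names are all our own illustrative assumptions.

```python
import math


def ground_state(chi, delta):
    """Normalized ground state of H = chi*sigma_z + delta*sigma_x (delta > 0)."""
    e = math.hypot(chi, delta)      # eigenvalues of H are +e and -e
    v0, v1 = -delta, chi + e        # solves (H + e) v = 0
    norm = math.hypot(v0, v1)
    return (v0 / norm, v1 / norm)


def step(psi, chi, delta, dt):
    """Apply exp(-i H dt) exactly for a constant 2x2 H of the assumed form."""
    e = math.hypot(chi, delta)
    c = math.cos(e * dt)
    s = math.sin(e * dt) / e
    a, b = psi
    ha = chi * a + delta * b        # first component of H psi
    hb = delta * a - chi * b        # second component of H psi
    return (c * a - 1j * s * ha, c * b - 1j * s * hb)


def final_fidelity(tau, chi1=-5.0, chi2=5.0, delta=1.0, steps=20000):
    """Overlap |<ground(chi2)|psi(tau)>|^2 after the linear sweep of Eq. (5)."""
    dt = tau / steps
    psi = tuple(complex(x) for x in ground_state(chi1, delta))
    for n in range(steps):
        t_mid = (n + 0.5) * dt      # evaluate H at the midpoint of each step
        chi = chi1 + (t_mid / tau) * (chi2 - chi1)
        psi = step(psi, chi, delta, dt)
    g = ground_state(chi2, delta)   # real vector, so no conjugation is needed
    overlap = g[0] * psi[0] + g[1] * psi[1]
    return abs(overlap) ** 2


slow = final_fidelity(tau=50.0)     # slow sweep: close to the adiabatic limit
fast = final_fidelity(tau=0.5)      # fast sweep: far from the adiabatic limit
print(slow, fast)
```

For these assumed parameters the slow sweep should leave the system almost entirely in the instantaneous ground state, while the fast sweep should not, which is the sense in which the AT holds only in the limit $\tau \to \infty$.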



A RELATION FOR dU 

The possibility of eliminating recourse to the AT should
be a relief to many who consider it a somewhat suspect
subterfuge: to our knowledge there has been no direct,
assumption-free experimental confirmation of the claim
that the final state of a system is really independent of
its path from the initial state when work is performed
[12]. It goes without saying that the consequences of the
first law are so well established that its validity is beyond
any reasonable doubt. Nevertheless, in the light of
this situation, a new proof of the first law, especially one
that avoids the awkward AT, should be of great interest.

We now proceed to derive a relationship for $dU$. For this
we need to deal with the internal energy $U$, that is, with
a special case ($i = 1$) of Eq. (2),

$$U = \langle O_1 \rangle = \langle H \rangle = \mathrm{Tr}[\rho H], \tag{6}$$

and consider variations $\delta\rho$ of the density operator, whose
normalization entails

$$\mathrm{Tr}[\delta\rho] = \delta\, \mathrm{Tr}[\rho] = \delta 1 = 0, \qquad \mathrm{Tr}[\delta\rho \ln \rho] = \delta\, \mathrm{Tr}[\rho \ln \rho]. \tag{7}$$

We also have, of course,

$$\mathrm{Tr}[\delta\rho\, O_i] = \delta \langle O_i \rangle. \tag{8}$$

Thus, we confront finding (cf. Eq. (8))

$$dU = \delta \langle O_{i=1} \rangle = \mathrm{Tr}[\delta\rho\, H]. \tag{9}$$



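Appropriate manipulation of Eq. (3) allows one now to
write $H$ in the fashion

$$\ln \rho + \ln Z = -\beta H - \sum_{i>1} \lambda_i O_i, \tag{10}$$

and thus

$$H = -\frac{1}{\beta} \Big( \ln \rho + \ln Z + \sum_{i>1} \lambda_i O_i \Big), \tag{11}$$

so that, replacing $H$ into Eq. (9), and minding also Eqs. (7) and (8),
yields

$$dU = -\frac{1}{\beta}\, \mathrm{Tr}[\delta\rho \ln \rho] - \sum_{i>1} \frac{\lambda_i}{\beta}\, \delta \langle O_i \rangle. \tag{12}$$

The first term in Eq. (12) is now to be recast in terms
of the entropy of the system, as given by Eq. (1), which
leads one to

$$dU = T\, dS - T \sum_{i>1} \lambda_i\, \delta \langle O_i \rangle, \tag{13}$$

where $T = 1/\beta$ is the temperature of the system. More
generally, one also has

$$\delta \langle O_k \rangle = \frac{dS}{\lambda_k} - \sum_{i \neq k} \frac{\lambda_i}{\lambda_k}\, \delta \langle O_i \rangle. \tag{14}$$

We make now the identification, for the heat ($Q$) change,

$$d'Q = T\, dS, \tag{15}$$

and we arrive at the promised relationship for $dU$,

$$dU = d'Q + dX, \qquad \text{with} \quad dX = -T \sum_{i>1} \lambda_i\, \delta \langle O_i \rangle. \tag{16}$$



WHAT IS X?

The derivation of (16) is straightforward. We are left
with the interpretation of $X$. Let us dwell a little longer
on the meaning of (16). We have assumed that our a
priori information has slightly changed:

$$\text{From } \langle O_i \rangle = a_i \text{ to } \langle O_i \rangle + \delta \langle O_i \rangle = a_i + \delta a_i \qquad (i = 1, \ldots, M). \tag{17}$$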



Necessarily then, the MaxEnt methodology yields a new
density operator $\rho + \delta\rho$, which, of course, entails in turn
a change in the Lagrange multipliers:

$$\text{From } \lambda_i \text{ to } \lambda_i + \delta\lambda_i \qquad (i = 1, \ldots, M). \tag{18}$$

The essential IT-content of the first law (cf. Eq. (16))
is that i) the $\delta \langle O_i \rangle$ are not independent quantities (cf.
Eq. (14)) and ii) they can be expressed solely in terms
of the $\lambda_i$. Of course, if one wishes to predict the value of
$\langle A \rangle$, with $A$ an operator not included in the set $\{O_i\}$, one would
need the $\delta\lambda_i$.
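The relation of Eq. (13) between $dU$, $dS$ and the changes in the remaining expectation values can be checked by finite differences. The sketch below again assumes, for simplicity only, commuting operators that are diagonal in a common basis; the four-level spectra, the multiplier values, and the small shifts of the multipliers are hypothetical, and the helper name is our own.

```python
import math


def maxent_state(lambdas, observables):
    """Return (expectation values, entropy) of the diagonal MaxEnt state.

    Assumption: the observables are given by their eigenvalues in one
    common eigenbasis, so the density operator is a probability vector.
    """
    n_states = len(observables[0])
    weights = [
        math.exp(-sum(lam * obs[n] for lam, obs in zip(lambdas, observables)))
        for n in range(n_states)
    ]
    z = sum(weights)
    probs = [w / z for w in weights]
    means = [sum(p * x for p, x in zip(probs, obs)) for obs in observables]
    s = -sum(p * math.log(p) for p in probs)
    return means, s


# Hypothetical four-level system: O_1 = H and one further observable O_2.
observables = [[0.0, 1.0, 2.0, 3.0], [1.0, 0.5, 2.0, 0.2]]
lam = [1.5, 0.3]          # lambda_1 = beta, lambda_2
d_lam = [2e-6, -1e-6]     # small change of the multipliers, as in Eq. (18)

means0, s0 = maxent_state(lam, observables)
means1, s1 = maxent_state([l + d for l, d in zip(lam, d_lam)], observables)

T = 1.0 / lam[0]
dU = means1[0] - means0[0]    # change of <O_1> = <H>
dS = s1 - s0                  # change of the entropy
d_O2 = means1[1] - means0[1]  # change of <O_2>

# Right-hand side of Eq. (13): T dS - T * sum_{i>1} lambda_i * delta<O_i>.
rhs = T * dS - T * lam[1] * d_O2
residual = abs(dU - rhs)
print(dU, rhs, residual)
```

Because Eq. (13) is a first-order statement, the residual is expected to be of second order in the shifts, that is, many orders of magnitude smaller than $dU$ itself.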

If we call the differential of work ($dW$), effected at
temperature $T$,

$$dW = dX = -T \sum_{i>1} \lambda_i\, \delta \langle O_i \rangle, \tag{19}$$



we obviously obtain, without further ado, the first law
of thermodynamics in the fashion (16), where heat ($d'Q$)
and work ($dW$) terms acquire their traditional aspect. If
we did not accept, for whatever reason, the interpretation
(19), we could not avoid the conclusion that the
difference $dU - d'Q$ has two forms: one of them follows
from (16) and always holds. In some particular instances,
we would, in addition, have the conventional first law.
On Ockham grounds, the first alternative, namely, Eq.
(19), is clearly preferable.

Consider the simple classical example posed by a
probability distribution $f(\tau)$ in phase space (volume
element $d\tau$), with two constraints:

$$f(\tau) = Z^{-1} \exp\big(-[\beta H(\tau) + p \beta\, \phi(\tau)]\big), \qquad \int d\tau\, f(\tau)\, H(\tau) = U, \qquad \int d\tau\, f(\tau)\, \phi(\tau) = V \qquad (V = \text{volume};\ p = \text{pressure}), \tag{20}$$



where application of Eq. (16) immediately yields

$$dU = T\, dS - p\, dV. \tag{21}$$



DISCUSSION AND CONCLUSIONS



We have shown that, within Jaynes' information theory
context, one may derive thermodynamics' first law without
appeal to the adiabatic theorem [3] or to an explicit
dependence of the pertinent Hamiltonian on hypothetical
external parameters. This agrees with both Ockham's
razor and Jaynes' philosophy [1, 2]. Thus we
avoid the slightly paradoxical contradiction between
simultaneously stating

• on the one hand, that $\rho$ contains all the available
information concerning the system, and,

• on the other, needing to add, to the theoretical
description, putative infinitely slowly varying
external parameters to obtain the first law.

There is no need to invoke the adiabatic theorem because,
interestingly enough, the formalism itself demands that
the process be undertaken at a constant temperature $T$
(cf. Eqs. (15)-(19)) that arises automatically in the
constrained Lagrange extremization.

For a system characterized by the set of operators
[$\{O_i\}$ ($i = 1, \ldots, M$); $O_1 = H$], in the sense that
we know a priori the pertinent expectation values,
we have here shown that the Jaynes treatment, in the
present light, implies that work is represented by changes
in the expectation values [$\{\langle O_i \rangle\}$; ($i = 2, \ldots, M$)]. These
constituted part of our prior knowledge. If a posteriori
we encounter changes, this entails that work has been
performed, on or by the system.